Abstract: For the particular case in which insertions and deletions occur only
at the tail of a given set S of n one-dimensional elements, we present a simpler
and more concrete algorithm than that presented in [Anderson, 2007], achieving
the same (but amortized) upper bound of $O(\sqrt{\log d/\log\log d})$ for finger
searching queries, where d is the number of sorted keys between the finger
element and the target element we are looking for. Furthermore, for the general
case in which insertions and deletions may occur anywhere, we present a new
randomized algorithm achieving the same expected time bounds. Even though the
new solutions achieve the optimal bounds only in the amortized or expected
case, their simplicity is of great importance because of the practical merits
we gain.
1 Introduction

By finger search we mean that we can have a finger pointing at a sorted key
x when searching for a key y. Here a finger is just a reference returned to the
user when x is inserted or searched for. The goal is to do better if the number
d of sorted keys between x and y is small. We also have finger updates: for a
deletion one has a finger on the key to be deleted, and for an insertion one has
a finger on the key after which the new key is to be inserted. In the
comparison-based model of computation, Raman [Raman, 1992] has provided optimal
bounds, supporting finger searches in $O(\log d)$ time while supporting finger
updates in constant time. On the pointer machine, Brodal et al. [Brodal, 2003]
have shown how to support finger searches in $O(\log d)$ time and finger updates
in constant time. Finally, Anderson and Thorup [Anderson, 2007] presented
optimal bounds on the RAM, namely $O(\sqrt{\log d/\log\log d})$ for finger
search with constant-time finger updates in the worst case. This optimal
solution, however, is very complicated and as a consequence not at all
practical.

In this paper, assuming that the insert/delete operations occur at the tail of
the set S, we present a new algorithm based on an implicit Nested Balanced
Distributed Tree (BDT), which handles finger-searching queries in optimal
amortized (and not worst-case) time $O(\sqrt{\log d/\log\log d})$, but also in a
simpler manner than that presented in [Anderson, 2007]. Consequently, our method
is much easier to implement.

For the general case, in which insertions/deletions may occur anywhere, we
present a new simple randomized algorithm based on the application of oblivious
on-line simple pebble games [Raman, 1992] to a new 2-level hybrid data
structure, where the top-level structure is a Level-Linked Exponential search
tree [Beam, 2002] and the bottom level consists of buckets of sub-logarithmic
size. Our new randomized method results in the following complexities:
$O(\sqrt{\log d/\log\log d})$ and $O(1)$ expected time for finger searching
and update queries respectively.

In the following section we review the preliminary data structures. In Section
3 we give an extended outline of our new solution for the special case in which
insertions/deletions occur at the tail of the given set. In Section 4 we study
the general case, in which insertions/deletions may occur anywhere, constructing
a randomized algorithm that achieves the same optimal expected time bounds. In
Section 5 we conclude.

2 Preliminary Data Structures 

2.1 Precomputation Tables 

Ajtai, Fredman and Komlós have shown in [Ajtai, 1984] that subsets of the
integers $\{1, \dots, n\}$ of size polylogarithmic in n can be maintained in
constant time so that predecessor queries (find the largest $i \in S$ such that
$i < x$) can be performed in constant time. In fact, their result is in the cell
probe model of computation; however, on a RAM with logarithmic word size their
functions can be represented by tables that can be incrementally precomputed at
a cost of $O(1)$ worst-case time and space per operation. The data structure
occupies space that is linear in the size of the subset.

2.2 Fusion Tree 

At STOC'90, Fredman and Willard [Fredman, 1990] surpassed the comparison-based
lower bounds for sorting and searching using the features available in standard
imperative programming languages such as C. Their key result was an
$O(\log n/\log\log n)$ time bound for deterministic searching in linear space.
The time bounds for dynamic searching include both searching and updates. Since
then much effort has been spent on finding the inherent complexity of
fundamental searching problems.

2.3 Amortized Exponential Search Tree 

In 1996, Anderson [Anderson, 1996] introduced exponential search trees as a
general technique reducing the problem of searching a dynamic set in linear
space to the problem of creating a search structure for a static set in
polynomial time and space. The search time for the static set essentially
becomes the amortized search time in the dynamic set. From Fredman and Willard
[Fredman, 1990], he got a static structure with $O(\sqrt{\log n})$ search time,
and thus he obtained an $O(\sqrt{\log n})$ time bound for dynamic searching in
linear space. Obviously the cost for searching is worst-case while the cost for
updates is amortized.

2.4 Beame-Fich (BF) structure

In 2002, Beame and Fich [Beam, 2002] showed that $O(\sqrt{\log n/\log\log n})$
is the exact worst-case complexity of searching a static set using polynomial
space. Using the above-mentioned exponential search trees, they obtained a
fully dynamic deterministic search structure supporting search, insert, and
delete in $O(\sqrt{\log n/\log\log n})$ amortized time. The BF structure can use
randomization (for rehashing) in order to achieve $O(\log\log N)$ expected
update time, where N is the size of the universe. The amortized operations are
very simple to implement in a standard imperative programming language such as
C or C++.

2.5 Worst-Case Exponential Search Tree

Finally, in 2007, Anderson and Thorup [Anderson, 2007] developed a worst-case
version of exponential search trees, giving an optimal
$O(\sqrt{\log n/\log\log n})$ worst-case time bound for dynamic searching. They
also extended this result to the finger searching problem, achieving the same
optimal time bound $O(\sqrt{\log d/\log\log d})$. The rebuilding operations,
however, are very complicated and very difficult to implement in a standard
imperative programming language such as C or C++.

3 A special case of finger searching 

We use as a base structure a Balanced Distribution Tree (BDT). In such a tree
the degree of the nodes at level i is defined to be $d(i) = t(i)$, where $t(i)$
indicates the number of nodes present at level i. This is required to hold for
$i \ge 1$, while $d(0) = 2$ and $t(0) = 1$. It is easy to see that we also have
$t(i) = t(i-1) \cdot d(i-1)$, so putting together the various components, we
can solve the recurrence and obtain for $i \ge 1$: $d(i) = 2^{2^{i-1}}$,
$t(i) = 2^{2^{i-1}}$. One of the merits of this tree is that its height is
$O(\log\log n)$, where n is the number of elements stored in it.
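
The recurrence can be checked numerically. The following is a minimal sketch,
not part of the original construction: the function name `bdt_levels` is ours,
and it assumes, as a simplification, that the tree is grown until the last level
holds at least n nodes.

```python
def bdt_levels(n):
    """Return [(i, d(i), t(i))] for levels 0..h, where h is the first
    level whose node count t(h) is at least n."""
    d, t = 2, 1              # d(0) = 2, t(0) = 1
    levels = [(0, d, t)]
    i = 0
    while t < n:
        i += 1
        t = t * d            # t(i) = t(i-1) * d(i-1)
        d = t                # d(i) = t(i) for i >= 1
        levels.append((i, d, t))
    return levels


for i, d, t in bdt_levels(10**6):
    print(i, d, t)
```

For n = 10^6 the loop stops at level 6, since t(5) = 65536 is still below n
while t(6) = 2^32 is not. Every level after the first matches the closed form
$2^{2^{i-1}}$, which illustrates the doubly logarithmic growth of the height.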

We consider the case in which we have only insertions/deletions at the end of
the set S, that is, insert(y) with $y > \max\{x_i \in S\}$ or delete(y) with
$y = \max\{x_i \in S\}$, $1 \le i \le n$. We build our structure by repeating
the same kind of BDT tree-structure in each group of nodes having the same
ancestor, and doing this recursively.



This structure may be imposed through another set of pointers (it helps to 
think of these as different color pointers). The innermost level of nesting will be 
characterized by having a tree-structure, in which no more than two nodes share 
the same direct ancestor. Figure 1 illustrates a simple example (for the sake of 
clarity we have omitted from the picture the links between nodes with the same 
ancestor).




Figure 1: The level-linked leaf-oriented nested BDT tree (panel labels: 1st,
2nd, 3rd and 4th nested subtree)

Thus, multiple independent tree structures are imposed on the collection of 
nodes inserted. Each element inserted contains pointers to its representatives in
each of the trees to which it belongs.

We need now to determine what will be the maximum number of nesting trees 
that can occur for n elements. Observe that the maximum number of nodes with 
the same direct ancestor is d{h — 1). Would it be possible for a second level tree 
to have the same (or bigger) depth than the outermost one? This would imply
that

$\sum_{j=1}^{h-1} t(j) < d(h-1)$,

as otherwise we would be able to fit all the $d(h-1)$ elements within the first
$h-1$ levels. But we need to remember that $d(i) = t(i)$, thus

$d(h-1) + \sum_{j=1}^{h-2} d(j) < d(h-1)$.

This would imply that the number of nodes in the first $h-2$ levels is negative,
clearly impossible. Thus, the second level tree will have depth strictly lower
than the depth of the outermost tree. As a consequence, the maximum number k of
nested trees that we can have is itself $O(\log\log n)$.

The basic intuition behind the use of the BDT tree is the reduction of the whole
set of $O(n)$ elements to the appropriate subset (nested subtree of Figure 1) of
$O(d)$ elements. Then, by applying to this subset the simple amortized solution
for the general searching problem presented in [Beam, 2002], we achieve an
optimal amortized solution for the finger searching problem. Although the
searching time complexity of our structure is amortized and not worst-case, as
it is in the solution of [Anderson, 2007], its simplicity is of great
importance, since we can gain many practical merits.

We equip each node (leaf) of level i, say $W_i$, with a searching information
array $A[1 \dots d(i)]$ ($L[1 \dots d(i)]$), where $d(i)$ is the size of the
array at level i. We organize the elements of these arrays with the structure
of Beame-Fich presented in [Beam, 2002]; let us call it $BF(W_i)$. We also equip
each leaf with $k = O(\log\log n)$ pointers to its respective copies at nested
levels (see in Figure 1 the pointers from leaf f). Each element of S is stored
in at most $O(\log\log n)$ levels, so the space of the structure is non-linear,
$O(n\log\log n)$, and the update (insertion/deletion) operation is performed in
$O(\log\log n)$ worst-case time.

In order to achieve linear space and $O(1)$ worst-case update time we use the
bucketing technique. The essence of the bucketing method is to get the best
features of two different structures by combining them into a two-level
structure. The data to be stored is partitioned into buckets, and the data
structure chosen to represent each individual bucket is different from the
top-level data structure that represents the collection of buckets (for
similar applications of this data structuring paradigm see also [Overmars,
1982], [Tsakalidis, 1984], [Raman, 1992]). More specifically, we partition the
elements of the set into contiguous buckets of size $O(\log\log n)$, with each
bucket being represented by the linear list scheme, and we store the first
element of each bucket in the leaf-oriented nested balanced distributed tree
scheme as its representative. When an item is inserted it is appended to the
tail of the list implementing the last incomplete bucket. If the size of this
bucket becomes $O(\log\log n)$, then a new bucket is created containing only the
inserted element, and we spend a further $O(\log\log n)$ time in order to insert
this element into the top-level structure.

We have a total of $O(n/\log\log n)$ representatives, each of which must be
inserted in at most $O(\log\log(n/\log\log n)) = O(\log\log n)$ nested levels.
Furthermore, at each of these levels (leaf-levels) we must update the respective
BF structures in $O(\log\log d(n_i))$ worst-case time, where $d(n_i)$ is the
size of the respective array L at the $n_i$-th level of nesting,
$1 \le n_i \le O(\log\log n)$. More precisely, the dynamic BF structure requires
amortized update time, but this special semi-dynamic case of updating implies
the following:



1. If $n < 2^{(\log\log N)^2/\log\log\log N}$ then the BF structure has only one
   part, the simple static data structure presented in [Anderson, 1996]. In this
   case we must execute a number of partial rebuilding operations at the right
   subtrees only of the whole structure, ensuring always that these subtrees
   have size at least [...] and at most [...], as follows. When an update causes
   a right subtree to violate this condition, we examine the sum of the sizes of
   that subtree and its immediate neighbor, which is always a full subtree with
   [...] elements, transferring the proper number of elements from the full
   neighbor node to the rightmost one, which we try to reconstruct. Until the
   next reconstruction we have enough time to spread the reconstruction cost
   incrementally, achieving $O(1)$ worst-case time. So, for the $O(\log\log n)$
   levels of the tree depicted in Figure 1, the total amount of update time
   becomes $O(\log\log n)$ in the worst case.

2. If $n \ge 2^{(\log\log N)^2/\log\log\log N}$, which implies
   $\sqrt{\log n/\log\log n} \ge \log\log N/(\sqrt{2}\log\log\log N)$, the BF
   structure consists of two parts. The first part is an x-fast trie of Willard
   [Willard, 1983] with branching factor $2^k$ and depth u, which organizes the
   top $1 + 2\lceil\log u\rceil$ levels for a set of $s \le n$ strings of length
   u over the alphabet $[0, 2^k - 1]$, where
   $u = 2(\log\log N)/(\log\log\log N)$, so that $u^u \ge \log N$. Intuitively,
   the x-fast trie reduces the predecessor problem, and more generally the
   dictionary problem, from the universe of size N to a subproblem with
   universe of size $2^k$, where
   $k = (\log N)/2^{1+2\lceil\log u\rceil} \le (\log N)/(2u^2)$ and
   $[2(u-1)^2 - 1]k < \log N \le b$, hence $b \ge [2(u-1)^2 - 1]k$. The second
   part consists of the appropriate hash functions constructed for each
   resulting subproblem. When an insertion/deletion occurs we have to
   insert/delete the appropriate hashed values. Since we investigate the special
   case where the updates occur at the tail only, the update of the hash
   functions described above can be done in $O(1)$ worst-case time. So, for the
   $O(\log\log n)$ levels of the tree depicted in Figure 1, the total amount of
   update time again becomes $O(\log\log n)$ in the worst case.

Due to the fact that $d(n_{i+1}) = \sqrt{d(n_i)}$ at level i, the total cost of
the update operations at the appropriate BF structures can be expressed as
follows:

$O(\log\log d(n_i)) + O(\log\log\sqrt{d(n_i)}) + O(\log\log\sqrt{\sqrt{d(n_i)}}) + \dots = O(\log\log n)$

Spreading the total $O(\log\log n)$ insertion cost over the $O(\log\log n)$ size
of each bucket, we achieve an $O(1)$ amortized insertion cost. For the same
reason as above it is easy to prove that the whole space is linear. We eliminate
the amortization by spreading the time cost for the insertion of the
representative over the next $O(\log\log n)$ updates of the bucket. Since we
have no a priori knowledge of n, we use the global rebuilding technique
[Overmars, 1981] in order to keep the buckets at an appropriate size of
$O(\log\log n)$, where n is the current number of elements. The question is:
is the search(f, s) query affected by the fact that, at the time the query is
performed, the incremental process, and consequently the insertion of the
bucket's representative in all possible nested levels, has not finished yet? In
the following lemma we build the appropriate algorithm and we show that there
is no possibility of such an effect.
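
The bucketing scheme for tail updates can be illustrated with a small sketch.
It is a simplification under stated assumptions, not the structure itself: the
class name `TailBuckets` and the fields `cap`, `reps` and `top_level_updates`
are hypothetical, the capacity `cap` is a fixed stand-in for the
$O(\log\log n)$ bucket size (no global rebuilding), and a plain Python list of
representatives stands in for the nested BDT top level.

```python
class TailBuckets:
    """Keys arrive in increasing order and are appended at the tail only."""

    def __init__(self, cap):
        self.cap = cap                # stand-in for the O(log log n) bucket size
        self.buckets = []             # each bucket is a plain list (linear list scheme)
        self.reps = []                # stand-in for the top-level nested BDT
        self.top_level_updates = 0    # counts the expensive top-level operations

    def insert_tail(self, key):
        if self.buckets and key <= self.buckets[-1][-1]:
            raise ValueError("only keys above the current maximum are supported")
        if not self.buckets or len(self.buckets[-1]) == self.cap:
            # Last bucket is full: open a new one and register its representative.
            self.buckets.append([key])
            self.reps.append(key)
            self.top_level_updates += 1
        else:
            self.buckets[-1].append(key)

    def delete_tail(self):
        last = self.buckets[-1]
        key = last.pop()
        if not last:
            # The bucket became empty: retire it together with its representative.
            self.buckets.pop()
            self.reps.pop()
            self.top_level_updates += 1
        return key


tb = TailBuckets(cap=4)
for k in range(1, 11):
    tb.insert_tail(k)
print(tb.buckets)   # [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10]]
print(tb.reps)      # [1, 5, 9]
```

Ten insertions trigger only three top-level updates, which is the source of the
$O(1)$ amortized insertion cost. The sketch does not spread the top-level work
incrementally, so alternating insertions and deletions at a bucket boundary
would still pay the full top-level cost each time.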

Lemma 1. The search*(f, s) operation is correct and requires
$O(\sqrt{\log d/\log\log d})$ amortized time.

Proof. We first give the new search*(f, s) algorithm, with the following
notation:
r_f = representative of the bucket to which the finger f belongs
r_s = representative of the bucket to which s belongs
r_n = representative of the not-full bucket

Procedure Search*(f, s)
1. Begin
2.   If f, s belong to the same bucket (full or not) or s > r_n then access s directly
3.   else fsearch(r_f, r_s)   /* this procedure follows */
4. End

Procedure fsearch(f, s)
1.  Begin
2.    W := Father(f)
3.    If s < A_W[rightmost] then go to L1   /* f, s have the same parent */
4.    Else Begin
5.      Repeat
6.        W1 := Father(W)
7.        If A_W1[rightmost] < s < A_neighbourW1[rightmost]
          /* that means f, s belong to the neighbor nodes W1 and neighbourW1
             respectively */
8.        then fsearch(leftmostleaf(T_neighbourW1), s)
9.      Until s < A_W1[rightmost]
10.     go to L2
11.   End
12.   L1: Begin
13.     j := -1, f = L[i]
        /* Find the appropriate nested subtree such that Father(f) ≠ Father(s) */
14.     Repeat
15.       j := j+1
16.     Until s < A[$2^{2^{\dots}} + 2^{2^{\dots}}$]
17.     Access the (j+1)-th copy of f, f_{j+1}
        /* by following the (j+1)-th pointer from finger (leaf) f */
18.     fsearch(f_{j+1}, s)
19.   End
20.   L2: Begin
21.     j := 0
22.     Repeat
23.       j := j+1
24.       search for s in the BF(W_j) structure
          /* At each node of the path W_1, W_2, ..., W_k, s search for s at the
             BF(W_1), ..., BF(W_k) structures respectively */
25.     Until s is found
26.   End
27. END
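
The repeat-loop of statements 14-16 probes doubly exponentially growing
distances from the finger. The sketch below illustrates that idea on a plain
sorted Python list rather than on the nested structure. The function name
`nesting_level` and the probe positions `i + 2**(2**j)` are our assumptions,
since the exact index expression of statement 16 is not legible in the source.

```python
def nesting_level(keys, i, s):
    """Smallest j >= 0 with s <= keys[min(i + 2**(2**j), len(keys) - 1)].

    Assumes keys is sorted, i is the finger position and
    keys[i] < s <= keys[-1].
    """
    j = 0
    # Each failed probe squares the distance covered from the finger,
    # so the number of probes is doubly logarithmic in the distance.
    while keys[min(i + 2 ** (2 ** j), len(keys) - 1)] < s:
        j += 1
    return j


keys = list(range(1000))
print(nesting_level(keys, 10, 13))    # distance 3   -> 1
print(nesting_level(keys, 10, 110))   # distance 100 -> 3
```

With the finger at position 10, a target at distance 100 is bracketed after the
probes at distances 2, 4, 16 and 256, that is, after about $\log\log d$ steps.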

1. Search*(f, s): According to [Ajtai, 1984], statement 2 requires $O(1)$
   worst-case time. In statement 3 we call the procedure fsearch(f, s), the
   complexity of which is analyzed as follows.

2. fsearch(f, s): When f, s have the same parent (see f, s1 in Figure 1),
   statement 3, we must determine the appropriate nested subtree of $O(d)$
   elements in which f, s do not belong to the same collection. So, in the
   repeat-loop 14-16 we execute exponential steps in order to find an
   appropriate value j which defines the collection (of $2^{2^j}$ elements) to
   which the distance d(f, s) belongs and consequently the appropriate
   (j+1)-th pointer from finger (leaf) f to its respective copy $f_{j+1}$. Then
   we call the same routine recursively (statement 18). Obviously the
   repeat-loop 14-16 requires $O(\log\log d)$ steps, due to the fact that the
   distance d between f and s is at least $2^{2^j}$. From finger f we have a
   number of $k = O(\log\log n)$ pointers, so by organizing them in a structure
   of [Ajtai, 1984] we can access the (j+1)-th pointer in $O(1)$ time.
   If f, s do not have the same parent, we execute the repeat-loop of statements
   5-9, which requires $O(\log\log d)$ steps in order to find the nearest common
   ancestor of f and s, $W_1 = nca(f, s)$. If f, s belong to the neighbor nodes
   $W_1$ and $neighbourW_1$ respectively (statement 7), we access the
   $neighbourW_1$ node in $O(1)$ time by following the neighbor pointer from
   $W_1$ to $neighbourW_1$ and we call the same search routine recursively,
   with the leftmost leaf of the $T_{neighbourW_1}$ subtree as the new finger;
   otherwise, by executing the repeat-loop of statements 22-26, we visit the
   appropriate search path $W_1, W_2, \dots, W_r, s$, at each node of which we
   search for s in the $BF(W_i)$ structures, $1 \le i \le r$ and
   $r = O(\log\log d)$, in $O(\sqrt{\log d(W_i)/\log\log d(W_i)})$ amortized
   time, where $d(W_i)$ is the degree of node $W_i$. This can be expressed by
   the following sum:

   $\sum_{i=1}^{r = O(\log\log d)} \sqrt{\frac{\log d(W_i)}{\log\log d(W_i)}}$

   Let $L_1$, $L_r$ be the levels of $W_1$ and $W_r$ respectively. So
   $d(W_1) = 2^{2^{L_1}}$ and $d(W_r) = 2^{2^{L_r}}$. But $d(W_r) = O(d)$, so
   $L_r = O(\log\log d)$. Now the previous sum can be expressed as follows:

   $\sqrt{\frac{2^{L_1}}{L_1}} + \sqrt{\frac{2^{L_1+1}}{L_1+1}} + \dots + \sqrt{\frac{\log d}{\log\log d}} = O\left(\sqrt{\frac{\log d}{\log\log d}}\right)$

   We note that each of the recursive calls of statements 8 and 18 is executed
   only once (this fact stems from the structure of the pseudocode), so there is
   no need to produce and solve the respective recurrence equation; very simply,
   the total time becomes $T = O(\sqrt{\log d/\log\log d})$.
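
The reason the sum over the search path collapses to its last term is that
consecutive terms $\sqrt{2^L/L}$ grow by a factor $\sqrt{2L/(L+1)}$, which is at
least 1 and approaches $\sqrt{2}$, so the series behaves like a geometric one.
The following numerical check is only a sketch: the function name `path_cost`
is ours, and it takes the worst case in which the path starts at level 1.

```python
import math


def path_cost(L_r):
    """Return (sum of sqrt(2**L / L) for L = 1..L_r, last term of that sum)."""
    terms = [math.sqrt(2 ** L / L) for L in range(1, L_r + 1)]
    return sum(terms), terms[-1]


# Ratio between the whole sum and its last term, maximized over L_r = 1..60.
worst = max(total / last for total, last in map(path_cost, range(1, 61)))
print(round(worst, 3))
```

Over this range the ratio stays below 4, so the whole sum is within a constant
factor of its last term $\sqrt{\log d/\log\log d}$.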



4 A randomized algorithm with the same expected time 
bounds 

Let us give a brief description of the combinatorial pebble games on which we
rely for constructing our new solution.

Pebble Games [Raman, 1992]: These games are played between two players, player
I (increaser) and player D (decreaser), on a set of n piles of pebbles, which
are initially empty. These games have the following general form: the game is
played in rounds, each consisting of one move from each player. Player I, on his
move, increases the number of pebbles on some of the piles, following which
player D decreases the number of pebbles on some pile. Let M be the maximum
value of any variable (pile size) at any point in the game. Player I's objective
is to maximize M, and player D's to minimize it. Typically, player D is an
algorithm and player I the environment.

Oblivious Pebble Games [Raman, 1992]: In this type of game player I reveals his
moves one at a time to player D, but player D's moves (and the status of the
piles) are hidden from him. Player D may use randomization to make his moves
unpredictable to player I. Here we are interested either in the expected value
of M or in studying the tails of M's distribution. Also, we typically restrict
the number of moves for which this game is played since, as it happens, the
longer the game is played, the more likely it is that player I will come close
to his performance in the on-line version of the game (for more details see
[Raman, 1992]). According to the Oblivious On-line Discrete Zeroing Game
[Raman, 1992], there is a D-strategy that ensures with high probability
($p \ge 1 - n^{-\alpha}$, for any constant $\alpha > 0$ and sufficiently large
n) that over n moves $M \in O(c\log\log n + c\log c)$, where c is an integer,
$c \ge 1$. This strategy is described by the following Algorithm1:

Algorithm1: Let $c \ge 1$ be an integer and $\delta_1, \dots, \delta_n$
non-negative integers such that $\sum_i \delta_i = c$. Then player D, on his
move, does the following:

1. Picks $i \in \{1, \dots, n\}$ with probability $\delta_i/c$ and sets $x_i$ to zero.

2. Picks i such that $x_i = \max_j \{x_j\}$ and zeroes $x_i$.
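
The two steps of player D can be simulated directly. The sketch below is an
illustration under assumptions that are ours: the function name
`play_zeroing_game` is hypothetical, and player I is modelled as dropping c
pebbles on uniformly random piles, which is oblivious but far weaker than the
worst-case adversary to which the stated bound applies.

```python
import random


def play_zeroing_game(n, c, rounds, seed=0):
    """Simulate the D-strategy against a random player I; return M."""
    rng = random.Random(seed)
    x = [0] * n
    M = 0
    for _round in range(rounds):
        # Player I: choose non-negative deltas with sum(deltas) == c.
        deltas = [0] * n
        for _pebble in range(c):
            deltas[rng.randrange(n)] += 1
        for i, delta in enumerate(deltas):
            x[i] += delta
        M = max(M, max(x))
        # Player D, step 1: zero pile i chosen with probability deltas[i] / c.
        i = rng.choices(range(n), weights=deltas, k=1)[0]
        x[i] = 0
        # Player D, step 2: zero a currently largest pile.
        x[x.index(max(x))] = 0
    return M


print(play_zeroing_game(n=1024, c=4, rounds=1024))
```

Step 1 needs only the increments of the current round, while step 2 is the
greedy move against the largest pile.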

For $c = O(\log\log n)$, $M \in O(\log^2\log n)$ with high probability. Based on
the D-strategy of Algorithm1, let us describe our randomized Algorithm2:

Algorithm2: Let n be the maximum number of keys present in the data structure
at any previous time. In a way similar to that presented in [Raman, 1992], we
can show that making the buckets of size $O(\log^2\log n)$ and using as the
top level the structure of Beame-Fich presented in [Beam, 2002] with level-links
suffices for our purposes, yielding a simple algorithm. We define the fullness
$\phi(b)$ of a bucket b as in [Raman, 1992]:
$\phi(b) = |b|/\log^2\log n$. We will ensure that $0.5 < \phi(b) < 2$.
We also define the criticality of a bucket b to be
$\rho(b, n) = \frac{1}{\alpha\log\log n}\max\{0,\ 0.7\log^2\log n - |b|,\ |b| - 1.8\log^2\log n\}$,
for an appropriately chosen constant $\alpha$. A bucket b is called critical if
$\rho(b, n) > 0$. To maintain the size of the buckets, every
$c = \alpha\log\log n$ updates we do the following:

1. We check the i-th bucket, $i \in \{1, \dots, n/\log^2\log n\}$, with
   probability $\delta_i/c$, meaning that we construct a randomized set of
   $c = O(\log\log(n/\log^2\log n)) = O(\log\log n)$ collections, each of which
   has $O(n/\log^2\log n)$ buckets; we choose one of these collections at random
   and finally the bucket of the collection in which
   $\delta_i = \max_j \{\delta_j\}$ updates have occurred. If this bucket has
   non-zero criticality we apply the rebalancing transformations of step 3.

2. We check the most critical bucket and if it has non-zero criticality we
   apply the following rebalancing transformations.

3. Split: If $\phi(b) > 1.8$, split the bucket into two parts of approximately
   equal size.

   Transfer: If $\phi(b) < 0.7$ and one of its adjacent buckets b' has
   $\phi(b') > 1$, then transfer elements from b' to b.

   Fuse: If $\phi(b) < 0.7$ and transferring is not possible, then fuse with an
   adjacent bucket b'.
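
The three rebalancing transformations can be sketched on a list of sorted
Python lists. Everything named here is hypothetical: `unit` stands for
$\log^2\log n$, `alpha_loglog_n` for $\alpha\log\log n$, and the transfer rule
(pool the two buckets and cut them in half) is our assumption, since the text
does not say how many elements are moved. Updates of the bucket representatives
in the top-level tree are omitted.

```python
def fullness(bucket, unit):
    return len(bucket) / unit                        # phi(b) = |b| / log^2 log n


def criticality(bucket, unit, alpha_loglog_n):
    size = len(bucket)
    return max(0, 0.7 * unit - size, size - 1.8 * unit) / alpha_loglog_n


def rebalance(buckets, k, unit):
    """Apply Split, Transfer or Fuse to buckets[k] in place; return the action."""
    b = buckets[k]
    if fullness(b, unit) > 1.8:                      # Split
        mid = len(b) // 2
        buckets[k:k + 1] = [b[:mid], b[mid:]]
        return "split"
    if fullness(b, unit) >= 0.7:
        return "none"
    neighbours = [j for j in (k - 1, k + 1) if 0 <= j < len(buckets)]
    if not neighbours:
        return "none"
    for j in neighbours:                             # Transfer
        if fullness(buckets[j], unit) > 1:
            lo, hi = min(j, k), max(j, k)
            pooled = buckets[lo] + buckets[hi]       # adjacent buckets stay sorted
            mid = len(pooled) // 2
            buckets[lo], buckets[hi] = pooled[:mid], pooled[mid:]
            return "transfer"
    j = neighbours[0]                                # Fuse
    lo, hi = min(j, k), max(j, k)
    buckets[lo:hi + 1] = [buckets[lo] + buckets[hi]]
    return "fuse"


buckets = [list(range(6)), list(range(6, 18))]       # fullness 0.6 and 1.2 for unit = 10
print(rebalance(buckets, 0, unit=10), [len(b) for b in buckets])   # transfer [9, 9]
```

When the fullness invariant $0.5 < \phi(b) < 2$ holds beforehand, each of the
three actions leaves the touched buckets with fullness between 0.7 and 1.8, so
they are no longer critical.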

It is clear that when a critical bucket is rebalanced, it becomes non-critical.
In addition to the time required to split/fuse buckets, a bucket rebalancing
step may require $O(\log\log N)$ expected time to insert/delete a bucket
representative to/from the top-level tree. The top-level tree is the BF
structure, which supports updates in $O(\log\log N)$ expected time. Since the
total work to rebalance a bucket is $O(\log\log N)$, we can perform it with
$O(1)$ work per update spread over no more than $\alpha\log\log n$ updates,
where the chosen parameter $\alpha$ is expressed as follows:
$\alpha = \log\log N/\log\log n$. For every real computer application N never
exceeds $2^{64} - 1$, thus $\alpha$ can be considered a constant much less than
6. So, if we permit every bucket to be of size $O(\log^2\log \hat{n})$, where
$\hat{n}$ is the number of current elements, we can guarantee that between
rebalancing operations of the top-level tree [Beam, 2002] there is no
possibility for any other such operation to occur, and consequently the
incremental spread of work is possible.

Let p be a finger. We search for a key k which is d keys away from p. If p, k
belong to the same bucket of size $O(\log^2\log n)$, we can access k directly
according to [Ajtai, 1984]; otherwise we first check whether $r_k$ (the
representative of the bucket to which k belongs) is to the left or right of
$r_p$ (the representative of the bucket to which the finger p belongs); say
$r_k$ is to the right of $r_p$. Then we walk towards the root; say we have
reached node u. We check in $O(\sqrt{\log d/\log\log d})$ time whether $r_k$ is
a descendant of u or of u's right neighbor on the same level. If not, then we
proceed to u's father. Otherwise we turn around and search for k in the
ordinary way.

Suppose that we turn around at node w of height h. Let v be the son of w that
is on the path to the finger p. Then all descendants of v's right neighbor lie
between the finger p and the key k. The subtree $T_w$ is a BF structure for d
elements, so the total time bound T becomes:
$T = O(\sqrt{\log d/\log\log d})$
So, we have proved the following theorem:



Theorem 2. There is a randomized algorithm with $O(1)$ and
$O(\sqrt{\log d/\log\log d})$ expected time for update and finger searching
queries respectively.

5 Conclusions 

In this paper we focused on the finger searching problem. For the special case
in which insertions/deletions occur at the tail of a given set S, we presented
an extended outline of an algorithm simpler than that presented in [Anderson,
2007], matching the optimal upper bound in the amortized case. Finally, for the
general case, in which insertions/deletions may occur anywhere, we relied on a
special combinatorial pebble game presented in [Raman, 1992] in order to present
a simple randomized algorithm that achieves the same optimal expected bounds.
Even though the described solutions achieve the optimal bounds only in the
amortized and expected case respectively, their simplicity is of great
importance because of the practical merits we can gain.